Choosing the best solution for the LAN storage crunch

By Nick Blozan

More users... more killer applications... more corporate files that seem to grow on their own. It's little wonder network managers see the "out of disk space" message flash across their screens so frequently. Because of these storage-hungry data loads, organizations are regularly doubling and tripling their hard-drive storage capacity.

It seems as though every time you have your storage requirements and capacity back in balance, new applications are added. The future promises even greater challenges as you add linked-object, voice-and-data and, ultimately, interactive service applications. Just as work will fill the available time, users will fill the capacity with "vital" information and will continue to require additional online storage.

Budget-conscious network managers and support personnel have attempted to conquer the beast. They clean up directories and ask users to delete old files. But files keep growing and the demand for more client-server storage capacity keeps increasing.

To meet the demand, a new breed of LAN software -- hierarchical storage management (HSM) -- is being heralded as the solution to the network manager's nightmare. HSM products are designed to optimize storage device use by locating files on appropriate media based on user-specified criteria. The goal is to provide ready access to vast amounts of storage and reduce the cost of maintaining volumes of corporate LAN data. But installation can be difficult and the cost high. Archiving provides a cost-effective alternative to HSM.

Before network managers choose a solution to their LAN storage crunch, they should ask a few questions. What is the nature of their files? What is the average size of the files? How active are they? And what will be the effects on end users?

Decade-old HSM solution

HSM has been around in the mainframe and minicomputer arenas for more than a decade and is only now migrating to the LAN environment. HSM operates on the principle that everything should be online all of the time, although some files should be more online than others.

HSM systems manage the online capacity by moving data around inside a managed storage repository, dynamically migrating, relocating and recalling files.

Eighty percent of corporate data is rarely accessed, according to industry analyst Michael Peterson of Santa Barbara, Calif.-based Peripheral Strategies. Of the 80 percent of files that haven't been accessed in the past 90 days, more than 70 percent won't be accessed in 120 days. If data hasn't been accessed in 120 days, less than 10 percent of it will ever be accessed again.

As a result, industry analysts caution that organizations shouldn't blindly jump on the HSM bandwagon. They are urging organizations to closely evaluate their real file storage and retrieval requirements to determine whether HSM or archiving is the best solution.

The key difference between HSM and archiving is that with HSM, the files are moved to another medium while archived files are copied to another medium. Some storage management software supports both archiving and a feature called grooming, which actually deletes the file from the hard disk after storing the file on a protected piece of media.

The primary reason to examine an organization's options is cost. Instead of buying new hard disk drives to accommodate the growing number of files, firms can store the bulk of their files on other media at a fraction of the cost.

In addition to investigating basic media costs, management also has to consider the cost of tracking files and optimizing online, nearline and offline LAN storage. These costs can run from $4 to $9 per megabyte per year. The bigger the network, the higher the annual data storage maintenance bill.
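To get a feel for the size of that bill, the per-megabyte range can simply be multiplied out. The Python sketch below is illustrative arithmetic only: the function name and the 20,000-megabyte example volume are assumptions made for the example, not figures from any vendor or analyst.

```python
def annual_management_cost(megabytes, low_rate=4.0, high_rate=9.0):
    """Return (low, high) yearly cost in dollars for managing `megabytes` of data.

    The default rates are the $4 and $9 per megabyte per year quoted in the text.
    """
    return megabytes * low_rate, megabytes * high_rate


# Hypothetical example: a LAN holding 20,000 megabytes of data.
low, high = annual_management_cost(20_000)
print(f"${low:,.0f} to ${high:,.0f} per year")  # $80,000 to $180,000 per year
```

Because the cost scales linearly with volume, doubling the stored data doubles both ends of the range.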

Current HSM offerings rely on a high-watermark/low-watermark system of migrating files. If a disk drive reaches a predefined high-watermark of 80 percent to 85 percent full, the software will migrate files until the disk reaches a low-watermark of 60 percent, or a similar predefined level.
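The watermark logic can be sketched in a few lines of Python. This is a minimal illustration, not any product's actual algorithm: the function name, the tuple layout and the choice to migrate the least recently accessed files first are all assumptions. Percentages are kept as integers so the threshold comparisons are exact.

```python
def files_to_migrate(files, capacity, high_pct=85, low_pct=60):
    """Choose files to migrate once disk usage reaches the high watermark.

    files: list of (name, size, days_since_access) tuples.
    capacity: disk size in the same unit as the file sizes.
    Returns the names to migrate, or an empty list if usage is below the
    high watermark.
    """
    used = sum(size for _, size, _ in files)
    if used * 100 < high_pct * capacity:
        return []  # high watermark not reached; nothing to do

    selected = []
    # Assumption: migrate the longest-idle files first.
    for name, size, _ in sorted(files, key=lambda f: f[2], reverse=True):
        if used * 100 <= low_pct * capacity:
            break  # low watermark reached
        selected.append(name)
        used -= size
    return selected


# Hypothetical 1,000-megabyte volume that is 90 percent full.
volume = [
    ("a.doc", 250, 200),
    ("b.xls", 300, 10),
    ("c.dat", 200, 120),
    ("d.txt", 150, 45),
]
print(files_to_migrate(volume, capacity=1000))  # ['a.doc', 'c.dat']
```

In the example, migrating the two longest-idle files takes the volume from 900 to 450 megabytes, which is below the 60 percent low watermark, so migration stops there.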

LAN administrators can determine which files are eligible for migration and which are not. If the organization employs a three-tier HSM solution, files can be moved to an even lower, slower and more economical tier (e.g. tape) based on a second set of time or file-size criteria. At the same time, LAN administrators can tailor their migration program to meet the organization's specific objectives or needs.

For example, system files and executables can be excluded from the migration equation. However, files that haven't been accessed for 30 or 60 days or files that exceed a certain size can be migrated to make the best use of storage without hurting response time for active files.
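Eligibility rules of this kind amount to a simple filter. The Python sketch below assumes a hypothetical list of excluded extensions, a 60-day idle limit and a 50-megabyte size limit; an administrator would substitute values that fit the organization.

```python
# Hypothetical extensions treated as system files or executables.
EXCLUDED_EXTENSIONS = (".exe", ".com", ".sys", ".bat")


def is_eligible(name, size_mb, days_since_access,
                max_idle_days=60, max_size_mb=50):
    """Return True if a file may be migrated under the assumed rules."""
    # System files and executables are never migrated.
    if name.lower().endswith(EXCLUDED_EXTENSIONS):
        return False
    # Otherwise migrate files that are idle long enough or larger than the limit.
    return days_since_access >= max_idle_days or size_mb > max_size_mb


print(is_eligible("NET.EXE", 80, 400))    # False: executable is excluded
print(is_eligible("REPORT.DOC", 1, 90))   # True: idle for 90 days
print(is_eligible("SCAN.TIF", 120, 2))    # True: exceeds the size limit
print(is_eligible("MEMO.DOC", 1, 5))      # False: small and recently used
```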

While HSM is a cost-effective way to expand an organization's storage capacity, it isn't free. The complete system requires an investment in storage media (optical jukeboxes and tape libraries), a dedicated HSM server and the software. Depending on the storage capacity, optical jukeboxes can range from less than $7,000 for 26-Gb units to $90,000 for 450-Gb subsystems. DAT libraries can range from $2,000 for 8-Gb drives to $6,000 for 48-Gb units. HSM software costs $7,000 to $10,000.

HSM has been used in the legacy systems arena since the mid-1970s and is a mainstay in Unix environments. Early users have reported a number of problems, including difficult and time-consuming installations, errors that render some stored files useless and compatibility problems with LAN operating systems and backup software. Backup compatibility is an important issue because many HSM products don't come with backup facilities.

HSM isn't backup

Because both HSM and backup software use tape or optical drives and both copy data to these media, it is understandable that they are often discussed at the same time. However, there are distinct differences. Backup software is an application that completes a function (copying) and is then inactive until the next backup is scheduled. Backup is used to make an insurance copy of data for use when the hard disk fails. For added insurance, a complete set of backups is usually stored offsite.

HSM, on the other hand, operates as a part of the network services under the presumption that an organization has a lot of data to store and most of it is used only occasionally.

With backup applications, network administrators can control the procedure. In addition to instructing the backup when to run, administrators can monitor the results and load and unload archived media.

Archiving products are capable of protecting media so that it doesn't accidentally get erased or overwritten. The conventional backup packages are used as a safeguard against a network disaster, and backup is a key component in an organization's disaster recovery plan.

Because of the problems mentioned earlier and the high initial cost of installing an HSM system, some industry analysts have said that LAN-based HSM will not gain widespread acceptance in the market for several more years. They say that some early innovators will be willing to make the leap of faith required to allow HSM to automatically migrate files off the network's hard disks and store them on secondary removable media. Most will move more cautiously, however, using more tried-and-true archiving approaches. In addition, HSM is not for everyone. Firms have to consider the size and makeup of their LAN environments as well as the nature of their stored files.

The archiving option

Archiving offers a cost-effective alternative to HSM. With this type of system, inactive files can be groomed or moved from the hard disk to tape based on criteria established by the network manager. Hard-drive directories are reviewed on a regular basis to determine which files can be moved to tape and protected by the system indefinitely. By moving the inactive files to tape, network administrators can recover expensive hard-disk capacity for active, ongoing work.

Peripheral Strategies' Peterson feels that archiving will be easier for an organization to justify when it is dealing with about 20Gb of data.

When compared to HSM, archiving is inexpensive and easy to deploy, despite the limitation of having to manually retrieve files.

Network administrators who have examined their options often report that HSM applications are five times more difficult to set up than an archiving system.

Today's newer, more robust backup, data management and recovery solutions provide the features and capabilities many organizations need to establish an archiving program. Unlike simple backup programs, these new software packages include a full-featured catalog librarian. The system tracks files and provides specific information about the tape media where the file is stored. Finding a specific file is simplified with explicit or wildcard character-search capabilities.

Network managers can do archiving automatically or earmark files for archiving based on preset criteria. The network administrator can then review the files selected for archiving and determine if the file should remain on the hard drive or be archived.

If a file is archived, the location of the archived file is entered into a database. To access the archived file, the end user must request it from the network administrator, who can pinpoint its location by consulting the database.
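A catalog of this kind can be pictured as a list of records searched by explicit name or wildcard. The Python sketch below is a toy model, not any product's catalog format: the class name, the path layout and the tape labels are all hypothetical, and a real catalog librarian would track far more detail.

```python
from fnmatch import fnmatchcase


class ArchiveCatalog:
    """A toy catalog mapping archived files to the tape that holds them."""

    def __init__(self):
        self.entries = []

    def record(self, path, tape_label):
        """Note that `path` was archived to the tape labeled `tape_label`."""
        self.entries.append((path, tape_label))

    def find(self, pattern):
        """Return (path, tape_label) pairs whose path matches the pattern.

        Matching ignores case. A pattern without wildcards acts as an
        explicit search; `*` matches any run of characters.
        """
        wanted = pattern.lower()
        return [(p, t) for p, t in self.entries
                if fnmatchcase(p.lower(), wanted)]


catalog = ArchiveCatalog()
catalog.record("SALES/1993/Q4.XLS", "TAPE-0012")
catalog.record("SALES/1993/NOTES.DOC", "TAPE-0012")
catalog.record("LEGAL/CONTRACT.DOC", "TAPE-0031")

print(catalog.find("sales/*.xls"))  # [('SALES/1993/Q4.XLS', 'TAPE-0012')]
print(catalog.find("*.doc"))
```

With the tape label in hand, the administrator knows which piece of media to load before restoring the file for the end user.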

These software programs typically work with tape storage devices that have capacities ranging from 1Gb to 8Gb, or optical disks with capacities of 1Gb to 1.5Gb. Prices range from $1,000 to $3,000 for a complete system, including the hardware and software.

Network managers need to consider the effect on end users when choosing between an HSM and an archiving solution. Retrieval may take minutes with archiving, or milliseconds with an HSM.

To adjust to the migration program, end users need to understand the problems that were solved and the benefits of the new storage solution. They will be more receptive to having slower access to seldom-used files if they understand the high cost of millisecond access, especially if the money saved can be invested in other areas such as enhanced object-oriented applications or voice/video capabilities.

Nick Blozan is a product marketing manager at Mountain Network Solutions Inc. in Scotts Valley, Calif. He can be reached at 408-439-3204.